Skip to content

[codex] Stabilize goal extraction batch tests#7891

Merged
kodjima33 merged 3 commits into
BasedHardware:mainfrom
tianmind-studio:codex/goal-extraction-batch-tests-windows
Jun 16, 2026
Merged

[codex] Stabilize goal extraction batch tests#7891
kodjima33 merged 3 commits into
BasedHardware:mainfrom
tianmind-studio:codex/goal-extraction-batch-tests-windows

Conversation

@tianmind-studio

@tianmind-studio tianmind-studio commented Jun 13, 2026

Copy link
Copy Markdown
Contributor

Summary

  • Update test_goal_extraction_batch.py to patch the current get_llm('goals') path instead of the removed llm_mini dependency.
  • Import utils.llm.goals behind temporary dependency stubs, then restore sys.modules and parent package attrs so collection does not pollute neighboring tests.
  • Reset every stubbed database, memory, vector, usage-tracking, and LLM mock in the autouse fixture so per-test return values and side effects cannot leak across the batch suite.
  • Add test_goal_extraction_batch.py to backend/test.sh now that it passes again.

This complements #7890: that PR wires the other stable unit files into backend/test.sh, while this PR fixes the previously failing goal extraction batch test before adding it.

Validation

  • python -m pytest backend\\tests\\unit\\test_goal_extraction_batch.py -q --tb=short -> 18 passed
  • python -m pytest backend\\tests\\unit\\test_goal_extraction_batch.py backend\\tests\\unit\\test_listen_pipeline.py -q --tb=short -> 122 passed
  • python -m pytest backend\\tests\\unit\\test_goal_extraction_batch.py backend\\tests\\unit\\test_goals_id_fallback.py -q --tb=short -> 27 passed
  • python -m black --check --line-length 120 --skip-string-normalization backend\\tests\\unit\\test_goal_extraction_batch.py
  • python -m py_compile backend\\tests\\unit\\test_goal_extraction_batch.py
  • git diff --check origin/main..HEAD
  • git diff --check
  • scripts/pre-commit

@greptile-apps

greptile-apps Bot commented Jun 13, 2026

Copy link
Copy Markdown
Contributor

Greptile Summary

This PR stabilizes the previously-failing test_goal_extraction_batch.py by updating its patching strategy from the removed llm_mini attribute to the current get_llm('goals') call path, and by adding a snapshot-and-restore mechanism for sys.modules so that temporary stubs don't leak into neighboring tests.

  • Test patching strategy updated: all 18 test cases now use patch.object(goals_module, \"get_llm\", ...) instead of patch(\"utils.llm.goals.llm_mini\", ...), matching the current production code in utils/llm/goals.py.
  • Module isolation hardened: a _restore_stub_modules() function captures the pre-stub state of sys.modules and parent-package attributes at load time, then restores them after the one-time isolated import of utils.llm.goals; the live module reference is preserved in _goals_module for use throughout the test session.
  • test.sh updated: the file is added to the CI test script in the correct position between the KG-sanitization and listen-pipeline tests.

Confidence Score: 4/5

Safe to merge. The changes are confined to a test file and the CI runner script; no production code is touched.

The module-isolation machinery is well-reasoned and the 18 tests correctly verify the batched get_llm call path. The restore logic handles parent-attribute cleanup carefully. The three observations are quality improvements — none cause test failures today, but the incomplete _RESTORED_MODULES coverage and partial mock-reset scope could create subtle ordering sensitivity as the test suite grows.

The restore coverage in test_goal_extraction_batch.py (lines 55–65) is worth a second look to decide whether utils.llm and utils should be included in _RESTORED_MODULES.

Important Files Changed

Filename Overview
backend/tests/unit/test_goal_extraction_batch.py Major refactor: replaces lazy per-test imports + patch("utils.llm.goals.llm_mini") with a one-time isolated import under stubs and patch.object(goals_module, "get_llm"). Adds snapshot-and-restore machinery for sys.modules to prevent pollution of neighboring tests. Logic is sound; minor gaps in restore coverage and mock-reset scope noted.
backend/test.sh Adds pytest tests/unit/test_goal_extraction_batch.py -v in the correct position (after the KG sanitization tests, before the listen pipeline tests), consistent with the ordering in the rest of the script.

Sequence Diagram

sequenceDiagram
    participant TF as test_goal_extraction_batch.py
    participant SM as sys.modules
    participant GM as utils.llm.goals

    TF->>SM: snapshot saved_modules and saved_parent_attrs
    TF->>SM: "install stubs for database.* utils.llm.clients utils.llm.usage_tracker"
    TF->>GM: importlib.import_module
    Note over GM: binds get_llm = stub, track_usage = stub, goals_db = stub
    GM-->>TF: _goals_module reference saved
    TF->>SM: _restore_stub_modules removes stubs and restores originals

    loop each test
        TF->>TF: reset_mocks resets get_user_goals and update_goal_progress
        TF->>GM: patch.object goals_module.get_llm with MagicMock returning mock_llm
        GM->>GM: extract_and_update_goal_progress calls goals_db.get_user_goals
        GM->>GM: calls get_llm via patched mock_llm.invoke
        GM-->>TF: result dict
    end
Loading

Comments Outside Diff (1)

  1. backend/tests/unit/test_goal_extraction_batch.py, line 190-194 (link)

    P2 reset_mocks fixture only resets two of the several goals-DB mocks

    get_user_goal (singular), record_llm_usage, and the vector/chat/memory mocks are set up at module level but never reset between tests. If any future test sets a custom return_value on one of these, that value leaks into every subsequent test in the file. Expanding the fixture to call mock_goals_db.reset_mock() would eliminate the risk.

Reviews (1): Last reviewed commit: "Stabilize goal extraction batch tests" | Re-trigger Greptile

Comment thread backend/tests/unit/test_goal_extraction_batch.py
Comment thread backend/tests/unit/test_goal_extraction_batch.py

@kodjima33 kodjima33 left a comment

Copy link
Copy Markdown
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Test-only stabilization (import isolation / stubs); zero product risk, CI not failing.

@kodjima33 kodjima33 merged commit ab9ce10 into BasedHardware:main Jun 16, 2026
1 check passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants